Exploration server for music in Saarland

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

How do search engines respond to some non-English queries?

Internal identifier: 000A11 (Main/Exploration); previous: 000A10; next: 000A12

Authors: Judit Bar-Ilan [Israel]; Tatyana Gutman [Israel]

Source: Journal of Information Science, vol. 31, no. 1, pp. 13-28 (2005-02)

RBID : ISTEX:B60C6FED4BDC9FEA0D07AE64677EC388A16CD99C

Abstract

In this article, we explored the capabilities of search engines for non-English languages. As a test case, we considered four languages: Russian, French, Hungarian and Hebrew. For each of these languages we tested three general search engines: Alta Vista, FAST and Google and some local search engines. Our results indicate that in the examined cases the general search engines ignore the special characteristics of non-English languages, and sometimes they do not even handle diacritics well. These findings are rather disturbing, since for example Google is very popular in non-English speaking countries as well, and users are either not aware of what they miss when using search tools that do not take into account the structure and the special characteristics of the specific language or have no alternatives but to use these search engines.
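
To make the remark about diacritics concrete, the following is a minimal sketch of accent folding, one common way a search tool can treat "electricite" and "électricité" as the same query term. It is an illustration only, not the method used in the article or by any of the engines named above; the function names `fold_diacritics` and `matches` are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks (accents) removed."""
    # NFD splits a precomposed letter such as "é" into "e" + combining acute.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def matches(query: str, document_word: str, ignore_diacritics: bool) -> bool:
    """Compare a query term with a document word, optionally folding accents."""
    if ignore_diacritics:
        query = fold_diacritics(query)
        document_word = fold_diacritics(document_word)
    # casefold() gives a case-insensitive comparison.
    return query.casefold() == document_word.casefold()


if __name__ == "__main__":
    print(fold_diacritics("électricité"))
    print(matches("electricite", "électricité", ignore_diacritics=True))
    print(matches("electricite", "électricité", ignore_diacritics=False))
```

Folding is a deliberate trade-off: it helps users who type without accents, but it also merges words that differ only by a diacritic, which matters in languages where the accent changes the meaning.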

DOI: 10.1177/0165551505049255



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">How do search engines respond to some non-English queries?</title>
<author wicri:is="90%">
<name sortKey="Bar Ilan, Judit" sort="Bar Ilan, Judit" uniqKey="Bar Ilan J" first="Judit" last="Bar-Ilan">Judit Bar-Ilan</name>
</author>
<author wicri:is="90%">
<name sortKey="Gutman, Tatyana" sort="Gutman, Tatyana" uniqKey="Gutman T" first="Tatyana" last="Gutman">Tatyana Gutman</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B60C6FED4BDC9FEA0D07AE64677EC388A16CD99C</idno>
<date when="2005" year="2005">2005</date>
<idno type="doi">10.1177/0165551505049255</idno>
<idno type="url">https://api.istex.fr/document/B60C6FED4BDC9FEA0D07AE64677EC388A16CD99C/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001294</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001294</idno>
<idno type="wicri:Area/Istex/Curation">001203</idno>
<idno type="wicri:Area/Istex/Checkpoint">000810</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000810</idno>
<idno type="wicri:doubleKey">0165-5515:2005:Bar Ilan J:how:do:search</idno>
<idno type="wicri:Area/Main/Merge">000A10</idno>
<idno type="wicri:Area/Main/Curation">000A11</idno>
<idno type="wicri:Area/Main/Exploration">000A11</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">How do search engines respond to some non-English queries?</title>
<author wicri:is="90%">
<name sortKey="Bar Ilan, Judit" sort="Bar Ilan, Judit" uniqKey="Bar Ilan J" first="Judit" last="Bar-Ilan">Judit Bar-Ilan</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Israël</country>
<wicri:regionArea>The Hebrew University of Jerusalem</wicri:regionArea>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Gutman, Tatyana" sort="Gutman, Tatyana" uniqKey="Gutman T" first="Tatyana" last="Gutman">Tatyana Gutman</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Israël</country>
<wicri:regionArea>Ex Libris Ltd, Malha Technology Park, Jerusalem</wicri:regionArea>
<wicri:noRegion>Jerusalem</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Journal of Information Science</title>
<idno type="ISSN">0165-5515</idno>
<idno type="eISSN">1741-6485</idno>
<imprint>
<publisher>Sage Publications</publisher>
<pubPlace>Sage CA: Thousand Oaks, CA</pubPlace>
<date type="published" when="2005-02">2005-02</date>
<biblScope unit="volume">31</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="13">13</biblScope>
<biblScope unit="page" to="28">28</biblScope>
</imprint>
<idno type="ISSN">0165-5515</idno>
</series>
<idno type="istex">B60C6FED4BDC9FEA0D07AE64677EC388A16CD99C</idno>
<idno type="DOI">10.1177/0165551505049255</idno>
<idno type="ArticleID">10.1177_0165551505049255</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0165-5515</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="Teeft" xml:lang="en">
<term>Alltheweb</term>
<term>Altavista</term>
<term>American society</term>
<term>Apostrophized article</term>
<term>August</term>
<term>Basic form</term>
<term>Capital letter</term>
<term>Chelovek shel</term>
<term>Cilip</term>
<term>Clef</term>
<term>Computer science</term>
<term>Cyber atlas</term>
<term>Diacritic</term>
<term>Different forms</term>
<term>Different meaning</term>
<term>European languages</term>
<term>Exact form</term>
<term>Exact form pages</term>
<term>Exact form results</term>
<term>Exact phrase</term>
<term>Exact search</term>
<term>Example google</term>
<term>Example queries</term>
<term>Feminine genders</term>
<term>Form documents</term>
<term>Francophone pages</term>
<term>Free text</term>
<term>French grammar</term>
<term>French language</term>
<term>Gender</term>
<term>General search engines</term>
<term>Google</term>
<term>Gutman</term>
<term>Gutman table</term>
<term>Hebrew</term>
<term>Hebrew searches</term>
<term>Hebrew university</term>
<term>Http</term>
<term>Imperfect aspect</term>
<term>Indian languages</term>
<term>Information retrieval</term>
<term>Information retrieval algorithms</term>
<term>Information science</term>
<term>Internet</term>
<term>Internet users</term>
<term>July</term>
<term>Language characteristics</term>
<term>Language information retrieval</term>
<term>Language tools</term>
<term>Large number</term>
<term>Latin alphabet</term>
<term>Lectricit</term>
<term>Lecture notes</term>
<term>Local search engines</term>
<term>Local versions</term>
<term>Ludi idut</term>
<term>Major search engines</term>
<term>More results</term>
<term>Morphological analysis</term>
<term>Morphological forms</term>
<term>Morphological search</term>
<term>Noun</term>
<term>Other hand</term>
<term>Other languages</term>
<term>Pages pages</term>
<term>Pages pages pages</term>
<term>Pages pages pages falvak</term>
<term>Partial word</term>
<term>Participle</term>
<term>Perfect aspect</term>
<term>Plural</term>
<term>Popular languages</term>
<term>Popular search engine</term>
<term>Previous observations</term>
<term>Previous search</term>
<term>Previous search pages</term>
<term>Query</term>
<term>Query term</term>
<term>Query word</term>
<term>Results documents</term>
<term>Retrieval</term>
<term>Right truncation</term>
<term>Same language family</term>
<term>Same meaning</term>
<term>Same number</term>
<term>Same root</term>
<term>Search capabilities</term>
<term>Search engine</term>
<term>Search engines</term>
<term>Search results</term>
<term>Search term</term>
<term>Search terms</term>
<term>Search tools</term>
<term>Simple pattern</term>
<term>Sites pages</term>
<term>Sites pages pages</term>
<term>Small number</term>
<term>Special characteristics</term>
<term>Truncation</term>
<term>User</term>
<term>Verb</term>
<term>Verb particles</term>
<term>Verbal forms</term>
<term>Vowel alternations</term>
<term>Walla</term>
<term>Whole word</term>
<term>Word forms</term>
<term>Words pages</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this article, we explored the capabilities of search engines for non-English languages. As a test case, we considered four languages: Russian, French, Hungarian and Hebrew. For each of these languages we tested three general search engines: Alta Vista, FAST and Google and some local search engines. Our results indicate that in the examined cases the general search engines ignore the special characteristics of non-English languages, and sometimes they do not even handle diacritics well. These findings are rather disturbing, since for example Google is very popular in non-English speaking countries as well, and users are either not aware of what they miss when using search tools that do not take into account the structure and the special characteristics of the specific language or have no alternatives but to use these search engines.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Israël</li>
</country>
</list>
<tree>
<country name="Israël">
<noRegion>
<name sortKey="Bar Ilan, Judit" sort="Bar Ilan, Judit" uniqKey="Bar Ilan J" first="Judit" last="Bar-Ilan">Judit Bar-Ilan</name>
</noRegion>
<name sortKey="Gutman, Tatyana" sort="Gutman, Tatyana" uniqKey="Gutman T" first="Tatyana" last="Gutman">Tatyana Gutman</name>
</country>
</tree>
</affiliations>
</record>
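
As an illustration of how a record of this shape might be read outside Dilib, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The record as displayed uses the `wicri:` prefix without declaring a namespace, so a namespace-aware parser would reject it; the sketch binds the prefix to a placeholder URI (`urn:example:wicri`, an assumption, not an official identifier). The helper names `parse_record` and `summarize` and the trimmed `SAMPLE` string are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: an assumption made only so the parser accepts
# the "wicri:" prefix. It is not an official Wicri identifier.
WICRI_NS = "urn:example:wicri"

# Trimmed copy of the record, kept short for illustration.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader><fileDesc>
<publicationStmt><idno type="doi">10.1177/0165551505049255</idno></publicationStmt>
<sourceDesc><biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">How do search engines respond to some non-English queries?</title>
<author wicri:is="90%"><name>Judit Bar-Ilan</name></author>
<author wicri:is="90%"><name>Tatyana Gutman</name></author>
</analytic>
<series><title level="j">Journal of Information Science</title></series>
</biblStruct></sourceDesc>
</fileDesc></teiHeader>
</TEI>
</record>"""


def parse_record(record_xml: str) -> ET.Element:
    """Parse a record after declaring the otherwise unbound wicri prefix."""
    patched = record_xml.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect title, authors, journal and DOI from the parsed record."""
    analytic = record.find(".//biblStruct/analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "journal": record.findtext(".//series/title"),
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
    }


if __name__ == "__main__":
    for key, value in summarize(parse_record(SAMPLE)).items():
        print(f"{key}: {value}")
```

Patching the root element is a workaround for the missing declaration; if the real export declares the namespace, the `replace` step would be unnecessary.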

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A11 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000A11 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:B60C6FED4BDC9FEA0D07AE64677EC388A16CD99C
   |texte=   How do search engines respond to some non-English queries?
}}

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024